I. Introduction.


The  purpose  of  this  paper  is  to  describe,  compare  and  contrast 
two  mathematical  models  used  to  describe  movement  of  personnel  through 
a  hierarchical  organization. 

The first model, which has received considerable attention in
the literature (for example, see Bartholomew (1967), Gani (1963),
Thonstad (1968)), assumes an underlying stationary Markov chain structure.
The important point of this type of model is that it uses cross-sectional
data of an organization in a given time period, and predicts what will
be the composition of the organization (i.e., the cross section) in the
following time period(s). A major advantage of such a method is that
it requires little data.

The  second  model  considered  here  is  of  the  cohort  type.  This 
model  follows  each  group  of  newly  entering  people,  called  a  cohort,  over 
their  lifetimes  in  the  organization.  Cross  sectional  structure  in  any 
time  period  is  found  by  considering  the  super-position  of  the  remaining 
members  of  all  the  previously  entering  cohorts.  Although  more  appealing 
from  a  theoretical  viewpoint,  this  model  typically  requires  considerably 
more  data  than  the  Markov  model. 

The  Markov  model  is  described  briefly  in  section  II  and  the  Cohort 
model  in  detail  in  section  III.  In  section  IV  an  attempt  is  made  to 
compare  theoretically  the  two  models.  Under  certain  conditions  the 
models  give  essentially  the  same  results.  The  results  of  the  analysis 
show  that  under  stationary  conditions  the  Markov  method  gives  a  good 
approximation  to  the  movement  through  an  organization,  and  since  its 
data requirements are small such a model may be preferred. However,
for organizations with changing or controlled cohort sizes the fractions
which appear in the Markov method should be changed from year to year,
and the model gives no functional relationship between its parameters
and the sizes of the cohorts. In the cohort method the parameters appear
as functions of the cohort sizes, and so in non-stationary situations
the cohort method may be preferred for long range forecasting.

In  section  V  some  enrollment  predictions  are  made,  using  both 
the  Markov  chain  and  Cohort  Models,  of  student  enrollments  at  the 
University  of  California,  Berkeley.  These  are  compared  with  actual 
enrollments.

II. The Markov Model.

The Markov chain model has been discussed in detail in the literature
(for example, see Bartholomew (1967), Gani (1963), Marshall et al
(1970)), but to unify notation, and for completeness and clarity, we
formulate it here. Throughout the paper we assume a system made up of
n active states.

At each time period it is assumed that people can stay in the
same grade, can move to other grades, or can leave the system. New
inputs are added to the continuing or promoted people in each grade.
Possible movement is shown schematically in figure 1 for 3 grades.

Let $X_i(t)$ be the number in grade i at time t, $i \in P$, and
let $X(t)$ be the row vector $(X_1(t), \ldots, X_n(t))$, where $|P| = n$. Let
$E_m[X(t)]$ be the vector of expected numbers in each rank at t + 1,
given the vector $X(t)$. Then

$$E_m[X(t)] = X(t)Q(t) + y(t+1), \qquad (1)$$

where

$$Q(t) = \begin{pmatrix} q_{11}(t) & q_{12}(t) & \cdots & q_{1n}(t) \\ q_{21}(t) & q_{22}(t) & \cdots & q_{2n}(t) \\ \vdots & \vdots & & \vdots \\ q_{n1}(t) & q_{n2}(t) & \cdots & q_{nn}(t) \end{pmatrix} \qquad (2)$$

and

$$y(t) = (y_1(t), \ldots, y_n(t)).$$

The vector $y(t+1)$ is a vector of new inputs into each grade
at time period (t+1), i.e., $y_i(t+1)$ = number of new people who enter
grade i at t+1. The matrix $Q(t)$ has the structure of the transient
part of a Markov chain matrix, and $q_{ij}(t)$ is the fraction of
those in i at t who will move to j at t+1.

The  main  advantage  of  this  model  is  that  only  a  small  amount  of 
data  is  required  to  estimate  the  coefficients;  only  the  grade  of  each 
person  in  the  last  two  time  periods  is  required. 
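
As an illustration of equation (1), the following Python sketch estimates the fractions $q_{ij}$ from one pair of consecutive cross sections and then makes a one-period forecast. The function names and all numbers are hypothetical and are not taken from the Berkeley data; the sketch assumes the flows between grades have already been tabulated from each person's grade in the two periods.

```python
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def estimate_q(flows: Matrix, stock: Vector) -> Matrix:
    """q[i][j]: fraction of those in grade i at t who are in grade j at t+1."""
    return [[f / stock[i] for f in row] for i, row in enumerate(flows)]


def markov_forecast(x: Vector, q: Matrix, y_next: Vector) -> Vector:
    """Equation (1): E_m[X(t)] = X(t) Q(t) + y(t+1), with X(t) a row vector."""
    n = len(x)
    return [sum(x[i] * q[i][j] for i in range(n)) + y_next[j] for j in range(n)]


# Hypothetical 3-grade data; none of these numbers come from the paper.
stock = [1000.0, 800.0, 600.0]      # X(t): people in each grade at t
flows = [[100.0, 700.0, 0.0],       # flows[i][j]: in grade i at t, grade j at t+1
         [0.0, 80.0, 640.0],
         [0.0, 0.0, 120.0]]         # row sums below stock[i] represent departures
new_inputs = [900.0, 50.0, 20.0]    # y(t+1)

q = estimate_q(flows, stock)
forecast = markov_forecast(stock, q, new_inputs)
print([round(v, 1) for v in forecast])
```

Because each row of Q may sum to less than one, people who leave the system are handled implicitly and need no separate state.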

Although the name "Markov chain" method gives the connotation
of a stochastic model, in most instances this model is treated in the
literature in terms of expected values only, and hence can be considered
to be deterministic. However, using the probabilistic interpretation
of the Markov chain, it is assumed that the probability a person is
promoted to state j, given he is now in i, is independent of how
long he has been in i, or how he got into i. This seems an unreasonable
assumption. In section III we formulate a "cohort" model of
movement through a system of grades which is based on more reasonable
assumptions. In section IV we compare the two models.

III. A Cohort Model.

People  who  enter  into  a  system  in  the  same  grade  and  in  the  same 
time  period  are  referred  to  as  a  cohort.  For  example,  all  freshmen 
entering  a  given  university  in  a  particular  academic  quarter,  or  all 
officers  entering  into  the  U.S.  Navy  as  Ensigns  with  regular  commission 
in  a  given  fiscal  year  would  be  considered  in  each  case  to  form  a  cohort. 

After  some  time  the  people  in  a  given  cohort  will  be  found  in 
various  grades  in  the  system,  and  some  will  have  left.  We  can  think 
of  the  people  in  a  given  grade  at  some  time  as  coming  from  many  previously 
entering  cohorts.  Indeed,  everyone  in  the  system  entered  in  some  cohort. 
The  cross-sectional  structure  in  a  given  time  period  can  be  thought  of 
as  the  result  of  the  superposition  of  the  remnants  of  all  previously 
entering  cohorts.  Figure  2  gives  a  schematic  representation  of  the 
cohort  model. 
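
The superposition idea can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions that are not part of the text: a single type of cohort, and fractions that depend only on the time since entry. The function name and the numbers are hypothetical.

```python
from typing import Dict, List


def expected_cross_section(cohort_sizes: Dict[int, float],
                           fractions: Dict[int, List[float]],
                           t: int) -> List[float]:
    """Expected number in each grade at time t, summed over cohorts entering at u <= t.

    cohort_sizes[u] : size of the cohort that entered at time u
    fractions[a][j] : fraction of a cohort found in grade j, a periods after entry
    """
    n = len(next(iter(fractions.values())))
    totals = [0.0] * n
    for u, size in cohort_sizes.items():
        age = t - u
        if age < 0 or age not in fractions:
            continue  # cohort has not entered yet, or has left entirely
        for j in range(n):
            totals[j] += size * fractions[age][j]
    return totals


# Hypothetical 3-grade example.
fractions = {0: [1.0, 0.0, 0.0],
             1: [0.1, 0.7, 0.0],
             2: [0.0, 0.1, 0.6],
             3: [0.0, 0.0, 0.1]}
cohort_sizes = {0: 1000.0, 1: 1200.0, 2: 900.0, 3: 1100.0}

section = expected_cross_section(cohort_sizes, fractions, t=3)
print([round(v, 1) for v in section])
```

Changing one entry of `cohort_sizes` changes the cross section in every later period in which that cohort still has members, which is the dependence on cohort sizes that the Markov fractions do not express.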

Let there be n different types of cohorts which enter the system.
For example, students can enter a university as freshmen, sophomores,
juniors, or seniors. Let $y_i(u)$ be the number who enter in cohort i
at time u. Let k index the people in a given cohort. Thus define

$$Z_{ij}^{(k)}(u,t) = \begin{cases} 1 & \text{if person } k \text{ of cohort } i \text{ which entered at } u \text{ is in } j \text{ at } t, \\ 0 & \text{otherwise,} \end{cases}$$

for $k = 1, 2, \ldots, y_i(u)$.

We shall assume that all cohorts behave independently of each
other and that all members of a given cohort have independent behavior.
Let $Z_i^{(k)}(u,t) = (Z_{i1}^{(k)}(u,t), \ldots, Z_{in}^{(k)}(u,t))$, $k = 1, 2, \ldots, y_i(u)$. Thus
we have a set of $y_i(u)$ independent and identically distributed
n-dimensional vectors.

Figure 2: Illustration of Cohort Model with 3 grades.

Now let

$$p_{ij}(u,t) = \Pr[Z_{ij}^{(k)}(u,t) = 1], \qquad (3)$$

and

$$P(u,t) = [p_{ij}(u,t)],$$

the n × n matrix. Then

$$E[Z_i^{(k)}(u,t)] = (p_{i1}(u,t), \ldots, p_{in}(u,t)).$$

Since we are interested in relating the positions of people in
consecutive time periods, define the 2n-vector

$$[Z_i^{(k)}(u,t), Z_i^{(k)}(u,t+1)] = [Z_{i1}^{(k)}(u,t), \ldots, Z_{in}^{(k)}(u,t), Z_{i1}^{(k)}(u,t+1), \ldots, Z_{in}^{(k)}(u,t+1)].$$

Let $X_{ij}(u,t)$ be the number of people in j at t who entered in i
at u. Also let $[X_i(u,t), X_i(u,t+1)]$ be the 2n-vector of $X_{ij}(u,t)$,
$X_{ij}(u,t+1)$, $j = 1, 2, \ldots, n$. Then

$$[X_i(u,t), X_i(u,t+1)] = \sum_{k=1}^{y_i(u)} [Z_i^{(k)}(u,t), Z_i^{(k)}(u,t+1)]. \qquad (4)$$

From our assumptions this vector is the sum of $y_i(u)$ independent
and identically distributed vectors, and thus for large cohort sizes
the $[X_i(u,t), X_i(u,t+1)]$, $i = 1, 2, \ldots, n$, $u \le t$, are each approximately
normally distributed (see for example, chapter 4 of Anderson (1958)).
We shall assume that cohorts are large enough for normality assumptions
to hold.


Let $X_j(t)$ be the number in grade j at time t and let
$X(t) = (X_1(t), \ldots, X_n(t))$. Then

$$[X(t), X(t+1)] = \sum_{u \le t} \sum_{i=1}^{n} [X_i(u,t), X_i(u,t+1)] + [0, y(t+1)], \qquad (5)$$

where $y(t+1)$ is the n-vector of new inputs at t+1, and 0 is an
n-vector of zeros. Again we have a sum of independent random vectors.
They are not identically distributed, but if each is approximately
normal, then the 2n-vector $[X(t), X(t+1)]$ has a multivariate normal
distribution. In terms of the original Z vector random variables,

$$[X(t), X(t+1)] = \sum_{u \le t} \sum_{i=1}^{n} \sum_{k=1}^{y_i(u)} [Z_i^{(k)}(u,t), Z_i^{(k)}(u,t+1)] + [0, y(t+1)]. \qquad (6)$$

In forecasting, what we need is the conditional expectation
$E[X(t+1) \mid X(t)]$. It is well known (see Anderson (1958), chapter 2) that
for the multivariate normal distribution,

$$E[X(t+1) \mid X(t)] = E[X(t+1)] + [X(t) - E[X(t)]]\,B(t)^{-1}C(t), \qquad (7)$$

where B(t) is the n × n covariance matrix of elements of X(t),
and C(t) is the n × n covariance matrix of elements of X(t) with
corresponding elements of X(t+1).

To compare this result with equation (1) we let $E[X(t+1) \mid X(t)] = E_c[X(t)]$
and write (7) as

$$E_c[X(t)] = X(t)B^{-1}(t)C(t) + y(t+1) + [E[X(t+1)] - y(t+1) - E[X(t)]B^{-1}(t)C(t)]. \qquad (8)$$

Equation (8) has the same linear structure as (1), but the coefficients
appear to be quite different from those of the Markov chain
model. We explore this further in section IV. However, we shall need
to know the structure of B(t) and C(t) in more detail, and now find
them in terms of the cohort sizes and the underlying probability
distributions.


Structure of B(t).

Recall that B(t) is the covariance matrix of the elements of
X(t). Thus $b_{ij}(t) = \mathrm{Cov}[X_i(t), X_j(t)]$ where $X_i(t)$ is the number in
state i at time t. From (6) we have

$$\mathrm{Cov}[X(t), X(t+1)] = \sum_{u \le t} \sum_{i=1}^{n} \sum_{k=1}^{y_i(u)} \mathrm{Cov}[Z_i^{(k)}(u,t), Z_i^{(k)}(u,t+1)].$$

The expression for B(t) in terms of the original probability
distributions is given in equation (9). Note that B(t) is symmetric with
off diagonal terms negative and diagonal terms positive. Now define
$\mu_j(t) = E[X_j(t)]$, the expected number in state j at time t. Then

$$\mu_j(t) = \sum_{u \le t} \sum_{i=1}^{n} y_i(u)\,p_{ij}(u,t).$$

Let M(t) be the diagonal matrix with diagonal elements $\mu_j(t)$. Also
define Y(u) to be an n × n diagonal matrix with diagonal elements $y_i(u)$.
With these definitions (9) simplifies considerably and we have

$$B(t) = M(t) - \sum_{u \le t} P(u,t)^T Y(u) P(u,t), \qquad (10)$$

where T denotes transpose and the P matrices are given by (3).
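
Equation (10) translates directly into a short computation. The Python sketch below is an illustration only: the function name is hypothetical, each cohort entry time u contributes one matrix P(u,t) and one vector of sizes y(u), and the numbers are invented.

```python
from typing import List

Matrix = List[List[float]]


def covariance_b(p_list: List[Matrix], y_list: List[List[float]]) -> Matrix:
    """B(t) = M(t) - sum_u P(u,t)^T Y(u) P(u,t), as in equation (10)."""
    n = len(y_list[0])
    b = [[0.0] * n for _ in range(n)]
    for p, y in zip(p_list, y_list):
        for i in range(n):
            for j in range(n):
                b[j][j] += y[i] * p[i][j]                # contribution to M(t)
                for l in range(n):
                    b[j][l] -= y[i] * p[i][j] * p[i][l]  # minus P^T Y P
    return b


# Hypothetical 2-grade system with two past entry times.
p_list = [[[0.9, 0.0], [0.0, 0.8]],      # P(u,t) for the most recent cohorts
          [[0.2, 0.6], [0.0, 0.5]]]      # P(u,t) for the cohorts one period older
y_list = [[1000.0, 200.0], [1000.0, 200.0]]

b = covariance_b(p_list, y_list)
print([[round(v, 1) for v in row] for row in b])
```

In this example the result is symmetric with positive diagonal and non-positive off-diagonal entries, in line with the remark preceding (10).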
Structure of C(t).

Earlier we defined C(t) to be the covariance matrix of elements
of X(t) with those of X(t+1). Thus $c_{j\ell}(t) = \mathrm{Cov}[X_j(t), X_\ell(t+1)]$.
Define the joint distribution

$$\pi_{ij\ell}(u,t) = \Pr[Z_{ij}^{(k)}(u,t) = 1, Z_{i\ell}^{(k)}(u,t+1) = 1],$$

all $k = 1, \ldots, y_i(u)$. Then

$$c_{j\ell}(t) = \sum_{u \le t} \sum_{i=1}^{n} y_i(u)\,(\pi_{ij\ell}(u,t) - p_{ij}(u,t)\,p_{i\ell}(u,t+1)). \qquad (11)$$

Let $\lambda_{j\ell}(t)$ be the expected number of people who move from grade j
at t to grade $\ell$ at t + 1, and let $A(t) = [\lambda_{j\ell}(t)]$, an n × n
matrix. Then from (11) and the definition of A

$$C(t) = A(t) - \sum_{u \le t} P(u,t)^T Y(u) P(u,t+1). \qquad (12)$$

IV. Model Comparison.

In this section we compare the two estimators $E_m$ and $E_c$ for
the Markov and Cohort models respectively. Taking the stochastic
interpretation of the Markov model we see that


Using (16) together with (14) and (15) we find that

$$E_m - E_c = [\mu(t) - X(t)][B^{-1}(t)C(t) - Q(t)]. \qquad (17)$$

Equation (17) is useful in comparing the two models. If in some
period t the actual distribution of personnel coincides with the
expected distribution, the models will give the same forecasts for
period t + 1. "On average" the difference between the two models'
forecasts will be zero but for a given period the difference will depend
on the size of $[B^{-1}(t)C(t) - Q(t)]$.
Using (12) we can write

$$C(t) = A(t) - F(t), \qquad (18)$$

where we have let

$$F(t) = \sum_{u \le t} P(u,t)^T Y(u) P(u,t+1).$$

Similarly, from (10)

$$B(t) = M(t) - G(t), \qquad (19)$$

where

$$G(t) = \sum_{u \le t} P(u,t)^T Y(u) P(u,t).$$

Using (18) and (19) with (13) we find that

$$[B^{-1}(t)C(t) - Q(t)] = B^{-1}(t)[G(t)Q(t) - F(t)]. \qquad (20)$$

Now if motion through the system is Markovian (possibly
non-stationary), then

$$P(u,t+1) = P(u,t)Q(t),$$

and the expression in (20) is zero. This shows the expected result that
if motion through a graded system is truly Markovian then the cohort
model and Markov chain model give identical forecasts.
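
The vanishing of G(t)Q(t) - F(t) under Markovian motion can be checked numerically. The Python sketch below is an illustration only: the helper names and the numbers are hypothetical, and a stationary Q is used purely for convenience. It also shows that the term is generally non-zero once one of the matrices P(u,t+1) is no longer equal to P(u,t)Q.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a: Matrix) -> Matrix:
    return [list(row) for row in zip(*a)]


def gq_minus_f(p_now: List[Matrix], p_next: List[Matrix],
               y: List[float], q: Matrix) -> Matrix:
    """sum_u P(u,t)^T Y [P(u,t) Q - P(u,t+1)], i.e. G(t)Q(t) - F(t)."""
    n = len(y)
    y_diag = [[y[i] if i == j else 0.0 for j in range(n)] for i in range(n)]
    total = [[0.0] * n for _ in range(n)]
    for p0, p1 in zip(p_now, p_next):
        left = matmul(transpose(p0), y_diag)     # P(u,t)^T Y
        g_term = matmul(matmul(left, p0), q)     # P^T Y P Q
        f_term = matmul(left, p1)                # P^T Y P(u,t+1)
        for i in range(n):
            for j in range(n):
                total[i][j] += g_term[i][j] - f_term[i][j]
    return total


# Hypothetical 2-grade example with constant cohort sizes.
identity = [[1.0, 0.0], [0.0, 1.0]]
q = [[0.2, 0.7], [0.0, 0.6]]
q2 = matmul(q, q)
q3 = matmul(q2, q)
y = [1000.0, 200.0]

markov = gq_minus_f([identity, q, q2], [q, q2, q3], y, q)
# Replace one P(u,t+1) so that it is no longer P(u,t) Q.
non_markov = gq_minus_f([identity, q, q2],
                        [q, [[0.0, 0.1], [0.0, 0.3]], q3], y, q)

print(max(abs(v) for row in markov for v in row))
print([[round(v, 1) for v in row] for row in non_markov])
```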

Since movement between grades is typically non-Markovian, we wish
to investigate further the error given by (17). We shall do this by
looking further at G(t)Q(t) - F(t) for some special cases.

Single Grade Case.

Let us consider the case where we have:

A1. The system has a single grade (n = 1),

A2. At each time period all input cohorts are the same (y(t) = y),

A3. The life distribution of each person in the system is stationary.

With these assumptions the models and their corresponding notation
simplify considerably. No subscripts are required on the distribution
p, and if L(u) is the lifetime in the system of a person
entering at u, then

$$\Pr[L(u) > t - u] = p(u,t) = p(t-u) \quad \text{under A3.}$$

If y is the constant cohort size for $u \le t$ (we cannot claim y(t+1) = y
and that (17) holds simultaneously), then

$$G = \sum_{u \le t} y\,p(t-u)^2, \qquad M = \sum_{u \le t} y\,p(t-u),$$

$$A = \sum_{u \le t} y\,p(t+1-u), \qquad F = \sum_{u \le t} y\,p(t-u)\,p(t+1-u).$$

All these are independent of t.

Now let $\ell = E[L] = \sum_{u \le t} p(t-u)$. Then

$$GQ - F = \frac{y}{\ell}\left(\sum_{u \ge 0} p(u)^2 \sum_{u \ge 0} p(u+1) - \sum_{u \ge 0} p(u+1)p(u) \sum_{u \ge 0} p(u)\right). \qquad (21)$$

The term in parenthesis in (21) is

$$(\ell - 1)\sum_{u \ge 0} p(u)^2 - \ell \sum_{u \ge 0} p(u)p(u+1) = \ell \sum_{u \ge 0} \Delta(u+1)p(u) - \sum_{u \ge 0} p(u)^2, \qquad (22)$$

where $\Delta(u+1) = \Pr[L = u+1] = p(u) - p(u+1)$.
Interpreting p(u) as the tail distribution of a non-negative
random variable, and writing $\Delta(u) = \Pr[L = u]$, one can show that

$$\sum_{u \ge 0} p(u)[1 - p(u)] = \sum_{u \ge 0} \Delta(u) \sum_{v \ge u} p(v), \qquad (23)$$

and

$$\sum_{u \ge 0} [\Delta(u) + \Delta(u+1)]\,p(u) = 1. \qquad (24)$$

Using (22), (23) and (24) in (21) gives

$$GQ - F = \frac{y}{\ell} \sum_{u \ge 0} \Delta(u)\left[\sum_{v \ge u} p(v) - \ell\,p(u)\right]. \qquad (25)$$

Let us assume now that the expected remaining lifetime of a person
whose time in the system exceeds u time periods is no more than the
expected lifetime $\ell$ of a new input. We say that people have "mean
residual life" bounded above by the original mean life, and say that
L has MRLA if

$$\sum_{v \ge u} \frac{p(v)}{p(u)} \le \ell, \quad \text{all } u = 0, 1, 2, \ldots \text{ for which } p(u) > 0.$$

Note that equality holds in this equation for the geometric distribution.
Table 1 shows that in a particular case of students attending the
University of California at Berkeley, this assumption is valid.

Under the MRLA assumption, from (25) we see that

$$GQ - F \le 0. \qquad (26)$$

Recall that

$$E_m - E_c = [\mu - X(t)]\,B^{-1}[GQ - F]. \qquad (27)$$

Since $B^{-1}$ is non-negative, we have the following conclusion under
the above four assumptions:

If in addition to A1 - A3 we assume L has MRLA,

a) If $X(t) < \mu$, then $E_m \le E_c$ and the Markov model under-estimates
the value of $E[X(t+1) \mid X(t)]$,

b) If $X(t) > \mu$, then $E_m \ge E_c$, and the Markov model over-estimates
the value of $E[X(t+1) \mid X(t)]$.


TABLE 1: Mean Residual Life of Freshmen Students Entering
U.C. Berkeley in Fall Semester, 1955.

Lifetime          Pr[L > u]*     sum over v >= u     sum over v >= u
(semesters) u     = p(u)         of p(v)             of p(v)/p(u)

     0            1.000          6.959               6.96
     1            0.972          5.959               6.14
     2            0.905          4.987               5.52
     3            0.756          4.082               5.42
     4            0.684          3.326               4.86
     5            0.593          2.642               4.47
     6            0.562          2.049               3.65
     7            0.524          1.487               2.84
     8            0.498           .936               1.88
     9            0.199           .465               2.34
    10            0.130           .266               2.05
    11            0.050           .136               2.72
    12            0.036           .086               2.39
    13            0.017           .050               2.94
    14            0.015           .033               2.20
    15            0.011           .018               1.64
    16            0.007           .007               1.00

* Source data found in Suslow et al (1968), [5].
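
The MRLA condition can be checked mechanically from the p(u) column of Table 1. The Python sketch below is an illustration only; the function names are hypothetical, and the values for u = 7 and u = 8 are the readings 0.524 and 0.498 of two entries that are hard to read in the table. Tail sums are recomputed from p(u), so the tail sum at u = 8 comes out as 0.963 rather than the printed .936.

```python
from typing import List

# p(u) = Pr[L > u] for u = 0, ..., 16, as read from Table 1.
P_TABLE1: List[float] = [1.000, 0.972, 0.905, 0.756, 0.684, 0.593, 0.562,
                         0.524, 0.498, 0.199, 0.130, 0.050, 0.036, 0.017,
                         0.015, 0.011, 0.007]


def tail_sums(p: List[float]) -> List[float]:
    """tails[u] = sum of p(v) over v >= u."""
    tails = [0.0] * len(p)
    running = 0.0
    for u in range(len(p) - 1, -1, -1):
        running += p[u]
        tails[u] = running
    return tails


def mean_residual_life(p: List[float]) -> List[float]:
    """Ratio tails[u] / p(u) for every u with p(u) > 0."""
    return [t / v for t, v in zip(tail_sums(p), p) if v > 0]


def has_mrla(p: List[float], tol: float = 1e-12) -> bool:
    """True if the ratio never exceeds the mean life ell = tails[0]."""
    ell = tail_sums(p)[0]
    return all(r <= ell + tol for r in mean_residual_life(p))


print(round(tail_sums(P_TABLE1)[0], 3))
print(has_mrla(P_TABLE1))
print(has_mrla([1.0, 0.1, 0.1, 0.1, 0.1]))  # hypothetical counterexample
```

The last line uses an invented distribution in which most people leave at once and the survivors stay much longer than average, so the condition fails.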


Since X(t) has a marginal normal distribution we can say more
about the expected error in the one dimensional case. $(E_m - E_c)$ is a
normal random variable with zero mean, and variance equal to $B^{-1}(GQ-F)^2$
(where these are all scalars). Thus we can say that with probability
about .95 the error $(E_m - E_c)$ will lie in the interval
$(-2B^{-1/2}|GQ - F|, +2B^{-1/2}|GQ - F|)$. The length of this interval is a
function of the cohort size y, and increases as $y^{1/2}$. The expected
value of X(t), $\mu$, increases as y. Thus the interval length divided by
$\mu$, or the fractional error range, decreases as $y^{-1/2}$. So as y
increases, and hence $\mu$ increases, the width of the confidence interval
of error increases much more slowly. To illustrate this we use the lifetime
distribution from table 1, and for various cohort sizes we show how the
interval length changes. The results are given in table 2. It is clear
from this table that even though the lifetime distribution differs
considerably from a Markovian (geometric) distribution with the same
mean, the confidence intervals on $E_m - E_c$ are extremely small relative
to the expected number in system, $\mu$. For comparison p(u) is drawn
in figure 3 together with a geometric distribution.


TABLE 2: 95% Confidence Intervals for $E_m - E_c$
for various Cohort Sizes.

Cohort Size y     mu*        Confidence Interval for E_m - E_c

   1000            6,959     (-7, 7)
   2000           13,918     (-10, 10)
   3000           20,877     (-12, 12)
   4000           27,836     (-14, 14)

* Based on lifetime distribution in table 1.
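
The entries of table 2 can be approximately reproduced from the lifetime distribution of table 1. The Python sketch below is an illustration under stated assumptions: it takes the scalar Q to be A/M, the expected fraction of those present who remain one more period, and it uses the readings 0.524 and 0.498 for p(7) and p(8). The function name is hypothetical.

```python
import math
from typing import List

# p(u) = Pr[L > u] for u = 0, ..., 16, as read from Table 1.
P_TABLE1: List[float] = [1.000, 0.972, 0.905, 0.756, 0.684, 0.593, 0.562,
                         0.524, 0.498, 0.199, 0.130, 0.050, 0.036, 0.017,
                         0.015, 0.011, 0.007]


def error_half_width(y: float, p: List[float]) -> float:
    """2 * B^(-1/2) * |GQ - F| for a single grade and constant cohort size y."""
    ell = sum(p)                                   # mean lifetime
    m = y * ell                                    # M, expected number in system
    g = y * sum(v * v for v in p)                  # G
    f = y * sum(a * b for a, b in zip(p, p[1:]))   # F
    q = (ell - p[0]) / ell                         # assumed Q = A / M
    b = m - g                                      # B = M - G
    return 2.0 * abs(g * q - f) / math.sqrt(b)


for y in (1000, 2000, 3000, 4000):
    print(y, round(y * sum(P_TABLE1)), round(error_half_width(y, P_TABLE1)))
```

Since G, F and B are all proportional to y, the half-width grows as the square root of y, so quadrupling the cohort size only doubles the interval.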


Figure 3: Comparison of p(u) for UCB Students with a geometric distribution.

Multigrade System.

Let us now relax assumption A1, but keep the assumptions A2
and A3 of constant cohort sizes and stationary distributions
respectively. Let Y be the diagonal matrix of cohort sizes at each time
period. Define $L = \sum_{u \ge 0} P(u)$, where $P(t-u) = P(u,t)$. Under such
stationary conditions $\mu(t) = \mu$ independent of t, and if y is
the n-vector of cohort sizes, then from expected value arguments

$$\mu Q = \mu - y.$$

Thus

$$\mu = y(I - Q)^{-1}$$

and also

$$\mu = y \sum_{u \ge 0} P(u) = yL.$$

Since these relationships hold for all y, $L = (I - Q)^{-1}$, and finally

$$Q = I - L^{-1}. \qquad (27)$$

Using (27) with the definitions of G and F, we have that

$$GQ - F = \sum_{u \ge 0} P(u)^T Y[P(u)(I - L^{-1}) - P(u+1)]. \qquad (28)$$
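
The relationship Q = I - L^{-1} can be illustrated with a small non-Markovian example. The Python sketch below is an illustration only: the matrices P(u), the cohort sizes and the helper names are hypothetical, and a two-grade system is used so that the inverse can be written out directly.

```python
from typing import List

Matrix = List[List[float]]


def inverse_2x2(a: Matrix) -> Matrix:
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]


def row_times(v: List[float], a: Matrix) -> List[float]:
    """Row vector times matrix."""
    return [sum(v[i] * a[i][j] for i in range(len(v))) for j in range(len(a[0]))]


# Hypothetical stationary fractions P(u), u = 0, 1, 2 (zero afterwards).
p_list: List[Matrix] = [[[1.0, 0.0], [0.0, 1.0]],
                        [[0.1, 0.8], [0.0, 0.5]],
                        [[0.0, 0.3], [0.0, 0.1]]]
y = [3000.0, 700.0]                        # constant cohort sizes

# L = sum over u of P(u)
big_l = [[sum(p[i][j] for p in p_list) for j in range(2)] for i in range(2)]
l_inv = inverse_2x2(big_l)
q = [[(1.0 if i == j else 0.0) - l_inv[i][j] for j in range(2)] for i in range(2)]

mu = row_times(y, big_l)                   # mu = y L
mu_q = row_times(mu, q)                    # should equal mu - y

print([round(v, 1) for v in mu])
print([round(v, 1) for v in mu_q], [m - c for m, c in zip(mu, y)])
```

The Q obtained this way is the matrix the Markov model would use in the stationary state, even though the fractions P(u) are not powers of any single matrix.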

Recall from (17) and (20) that

$$E_m - E_c = [\mu - X(t)]\,B^{-1}[GQ - F].$$

It is easy to show that $B^{-1}$ is non-negative, but the conditions under
which $E_m \ge E_c$, or conditions for this inequality to hold for some
element i, are much more complex than in the single state case. Let
$\Delta(u+1) = P(u) - P(u+1)$. Then the multi-dimensional equivalents of
(23) and (24) are

$$\sum_{u \ge 0} [I - P(u)^T]\,Y(u)P(u) = \sum_{u \ge 0} \Delta(u)^T \sum_{v \ge u} Y(v)P(v), \qquad (29)$$

and

$$\sum_{u \ge 0} [P(u)^T Y \Delta(u+1) + \Delta(u)^T Y P(u)] = Y. \qquad (30)$$

Note that (30) only holds for Y a stationary matrix, whereas in (29)
Y(u) can change over time.

Using (29) and (30) in (28) gives as the multidimensional equivalent
of (25),

$$GQ - F = \sum_{u \ge 0} \Delta(u)^T Y\left[\sum_{v \ge u} P(v)L^{-1} - P(u)\right]. \qquad (31)$$

Although this equation has great similarity to (25) it is quite different.
Even if one can say something about the sign of $\sum_{v \ge u} P(v)L^{-1} - P(u)$,
it is usually true that $\Delta(u)$ is not non-negative, as in
the single dimensional case. Also of course the elements of $[\mu - X(t)]$
can differ in sign, so that the conditions for each element of $E_m - E_c$
to be either negative or positive do not seem simple or natural.
Equation (28) seems to be the most useful for computation purposes.
Note that $(E_m - E_c)$ has a multivariate normal distribution with mean 0
and covariance matrix $(GQ-F)^T (B^{-1})^T (GQ-F)$. Using the data given in the
appendix for freshmen, sophomores, juniors and seniors at the University
of California, Berkeley 1955-1969, some calculations were made assuming
constant cohort sizes of 3000 freshmen, 700 sophomores, 1300 juniors
and 150 seniors entering each fall semester. These figures are
approximately what the Berkeley campus has been experiencing in its fall new
admissions.

Table 3 gives the matrix B, whose (i,j)th element is the
covariance of $X_i(t)$ and $X_j(t)$ for some t. Also included is $\mu$,
the vector of expected values of numbers in each state.


TABLE 3: Covariance Matrix B for the 4-state example.

State i \ State j     Fresh     Soph      Jun      Sen

Fresh                   673     -454      -30      -10
Soph                   -454     1453     -380      -43
Jun                     -30     -380     2137     -535
Sen                     -10      -43     -535     2216

Expected Values        3868     3324     4687     3227

The variance of the number in each state increases as the state
increases, and all states are negatively correlated.

Table 4 gives the matrix $(GQ-F)^T B^{-1} (GQ-F)$, which is the
covariance matrix of the error $(E_m - E_c)$. It can be seen that these numbers
are very small compared to the size of the predicted values, as was
found in the single state case.

TABLE 4: Covariance Matrix of $E_m - E_c$.


The matrix $B^{-1}(GQ-F)$ is given in table 5.


TABLE 5: $B^{-1}(GQ-F)$ for the 4-state example.


This is an example of where (GQ-F) is neither $\ge 0$ nor $\le 0$,
unlike the single state case.

Even though movement through the system is far from that
represented by a stationary Markov Chain (i.e., $P(u) \ne P^u$ for some P),
when constant cohort sizes are used the Markov Chain Model gives
essentially the same prediction as the more complex cohort model.

However, the Cohort Model was primarily formulated for forecasting
under conditions of controlled input. This is the situation when academic
planning is implemented, and under such conditions the sizes of cohorts
in successive time periods can and do vary considerably. For example, the
freshmen cohorts in the fall quarters at Berkeley in the period 1966-1969
are shown in table 6. This was a period when total campus enrollment was
controlled, and new students entered only to fill available room.

TABLE 6: Freshmen Cohort Sizes at U.C. Berkeley

Date          Cohort Size

Fall 1966     3,053
Fall 1967     3,303
Fall 1968     2,239
Fall 1969     1,883

One can see from equation (13), since X(t) and $\mu(t)$ are both
functions of previous cohort sizes (up to period t), that the Markov chain
transition probabilities will change with time, and that estimating them
from cross-sectional data in two consecutive years will not account for
changes in cohort sizes. In the next section we make forecasts one year
ahead with both models and compare the results.

V. Enrollment Forecasts.

In this section we use data up to the spring quarter of 1970 at
Berkeley to forecast continuing and returning undergraduate students at
the freshman, sophomore, junior and senior levels, in the fall quarter
of 1970. Both the Cohort and Markov Chain models are used, and results
are compared with actual enrollments.

In  applying  the  Cohort  Model  directly,  three  problems  appeared, 
all  associated  with  the  start-up  and  operation  of  the  quarter  system  at 
Berkeley. 

The first winter and summer quarters were offered in 1967. The
fractions of students who entered in these quarters and were enrolled in
F69 (this notation will be used in this section; F69 means fall quarter
1969) are now applied to cohorts entering in the winter and summer of 1968
when forecasting for F70. It would certainly be expected that some students
from the winter and summer quarters of 1967 would also be enrolled in F70,
but how many? We have no fractions for winter or summer 1966. These
fractions have to be estimated in some reasonable way. An average was
taken of the fractions from F65 and Sp66 for the winter quarter, and from
Sp66 and F66 for the summer quarter.

The third problem that arose was in deciding what fractions to apply
to the students who entered in Su69. These students had available only the
winter and spring quarters of 1970 before F70. The students who entered
in Su68 could attend winter, spring and summer quarters before F69. It was
felt that larger fractions of Su69 entrants would attend the fall of 1970
than the fractions of Su68 students attending F69. But how much larger?
To estimate attendance of Su69 entrants it was assumed that the same
fraction of these would attend F69 as did Su68 entrants in F68. Those
who enrolled in F69 were then assumed to behave in the same way as
new entrants in F69.

Besides  these  three  particular  and  rather  confusing  problems,  the 
stationarity  of  most  of  the  fractions  since  the  start  of  the  summer  quarter 
can  be  questioned.  With  such  a  major  change  in  campus  operations  it  will 
take  a  number  of  years  to  settle  down  even  if  there  were  no  changes  between 
3-quarter  and  4-quarter  operations. 

The Markov Chain Model was used in the following way. The transition
matrix from F68-F69 was determined by finding the fractions of those
enrolled in each grade in F68 who were enrolled in each grade in F69. This
matrix is shown in table 7.


TABLE  7:  Markov  Chain  Matrix  for  F68-F69  at  Berkeley 


If this is applied to F69 enrollments, the prediction for F70 will
have ignored new inputs in W70 and Sp70 (the summer quarter 1970 was not
held). To make a fair comparison the same fractions of these were assumed
to enroll in F70 as were assumed in the Cohort Model.

Table 8 shows the forecasts from the two models together with the
actual figures. It can be seen that the cohort model gave significantly
better predictions than the Markov Chain method. This is not surprising,
since these forecasts are made for a period of much instability on the
Berkeley campus, both in student behavior and in academic policy.


TABLE 8: Enrollment forecasts for Fall 1970 at Berkeley,
Continuing and Returning Students

                      Freshman   Sophomore   Junior   Senior    Total

Markov Chain Model         958       2,737    4,356    4,189   12,240
Cohort Model             1,115       3,018    4,508    4,670   13,311
Actual                   1,591       3,136    4,632    4,261   13,620
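
The size of the forecast errors in table 8 can be tabulated directly. The Python sketch below only restates the table's figures; the variable and function names are hypothetical.

```python
from typing import Dict

LEVELS = ["Freshman", "Sophomore", "Junior", "Senior"]

# Figures copied from Table 8.
markov: Dict[str, int] = {"Freshman": 958, "Sophomore": 2737, "Junior": 4356, "Senior": 4189}
cohort: Dict[str, int] = {"Freshman": 1115, "Sophomore": 3018, "Junior": 4508, "Senior": 4670}
actual: Dict[str, int] = {"Freshman": 1591, "Sophomore": 3136, "Junior": 4632, "Senior": 4261}


def errors(forecast: Dict[str, int]) -> Dict[str, int]:
    """Forecast minus actual for each class level (negative = under-forecast)."""
    return {level: forecast[level] - actual[level] for level in LEVELS}


markov_err = errors(markov)
cohort_err = errors(cohort)

print("Markov:", markov_err, "total", sum(markov_err.values()))
print("Cohort:", cohort_err, "total", sum(cohort_err.values()))
print("Sum of absolute errors:",
      sum(abs(v) for v in markov_err.values()),
      sum(abs(v) for v in cohort_err.values()))
```

On these figures the cohort forecast is closer in total and at three of the four levels, while the Markov forecast is closer at the senior level.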

APPENDIX

Data used in calculations in table 3. The time periods u are
in years. The data are from many different cohorts, and each number is
the fraction of a particular cohort who were enrolled at U.C. Berkeley
in the given class in the fall quarter of 1969. Let:

State 1: Freshmen
State 2: Sophomores
State 3: Juniors
State 4: Seniors.

Example:

$p_{13}(3)$ = fraction of students who entered as freshmen
in Fall 1966 who registered as juniors in
Fall 1969 (0.281).

T 

Time  u  P(u)  y(u) 




Time  u 


2 


3 


4 


y(u)T 

3303 

843 

1662 

175 


P(u) 


.003 

.003 

.004 

.01? 

3620 

.001 

.004 

.007 

728 

6 

.001 

.003 

1569 

0  _ 

199 

All numbers are rounded off to 3 figures. For more detail see
Marshall and Suslow (1971).